ABSTRACT: A general class of tests designed to detect conditional mean
misspecification for cross section or time series applications is proposed.
The  tests  are  derived  from  a  particular  application  of  the  encompassing 
principle.   The  resulting  conditional  mean  encompassing  (CME)  tests  contain 
as  special  cases  a  version  of  the  Lagrange  Multiplier  test  for  nested  models, 
a  new  test  in  the  presence  of  nonnested  alternatives,  and  a  version  of  the 
Durbin-Wu-Hausman  test  that  compares  two  weighted  nonlinear  least  squares 
estimators.   The  tests  are  valid  without  any  assumption  on  the  conditional 
variance of the dependent variable and can be computed using any
$\sqrt{T}$-consistent estimators. Moreover, CME tests for nonlinear, dynamic models
are  computable  from  linear  least  squares  regressions. 


1.  Introduction 

This paper develops a general class of tests intended to detect
misspecification of a conditional expectation for cross section or time
series  models.   The  approach  is  based  on  the  encompassing  principle  (Hendry 
and  Richard  (1982),  Mizon  (1984),  and  Mizon  and  Richard  (1986))  in  the  sense 
that  it  exploits  certain  implications  of  estimating  an  alternative  model  when 
the  model  taken  to  be  the  null  is  true.   However,  for  nonlinear,  dynamic 
models  the  present  application  of  the  encompassing  principle  results  in 
"conditional  mean  encompassing"  (CME)  tests  that  are  more  operational  than 
the  "complete  parametric  encompassing"  (CPE)  tests  proposed  by  Mizon  and 
Richard  (1986).   In  particular,  there  is  no  need  to  solve  for  the 
"pseudo-true  value"  of  the  estimator  from  the  alternative  model,  nor  does  one 
need  to  compute  the  null  limiting  distribution  of  the  estimator  in  the 
alternative  model.   Both  of  these  tasks  can  be  difficult  in  the  context  of 
nonlinear  regression  with  dependent  observations. 

The  main  results  of  the  paper  can  be  briefly  summarized.   For  nested 
hypotheses  the  CME  test  is  asymptotically  equivalent  to  the  Lagrange 
Multiplier  (LM)  test.   For  nonnested  models,  the  CME  test  is  based  on  the 
correlation  between  the  residuals  under  the  null  and  the  gradient  of  the 
alternative  regression  function.   The  results  of  Wooldridge  (1988)  are 
applied  throughout  to  produce  tests  that  can  be  computed  from  linear 
regressions  but  do  not  maintain  homoskedasticity  or  other  second  moment 
assumptions under $H_0$. Further, the test statistics can be computed using any
$\sqrt{T}$-consistent estimators. These features make the tests applicable in
situations more general than the usual LM test and standard tests of
nonnested hypotheses.

When  the  approach  is  extended  to  weighted  nonlinear  least  squares  (WNLS) 
estimation,  a  statistic  that  is  asymptotically  equivalent  to  the 
Durbin  (1954)  -  Wu  (1973)  -  Hausman  (1978)  (DWH)  statistic  that  compares  two 
WNLS  estimators  can  be  shown  to  be  a  special  case.   Again,  because  this  test 
is  a  special  case  of  the  general  approach,  the  form  of  the  statistic  proposed 
here  is  regression-based  but  does  not  require  either  estimator  to  be 
relatively efficient under $H_0$. Additional robust tests in the presence of
nonnested  alternatives  are  available  from  WNLS  estimation. 

2.  Setup  and  Motivation 


Let $\{(y_t, z_t): t = 1, 2, \ldots\}$ be a sequence of random vectors where $y_t$ is a
scalar and $z_t$ is a $1 \times K$ vector of conditioning variables. In a time series
context, let $x_t \equiv (z_t, y_{t-1}, z_{t-1}, \ldots, y_1, z_1)$ or
$x_t \equiv (y_{t-1}, z_{t-1}, \ldots, y_1, z_1)$ or $x_t \equiv (y_{t-1}, y_{t-2}, \ldots, y_1)$
be the set of predetermined variables. The choice of $x_t$ depends on whether
there are, in addition to past values of $y_t$, other conditioning variables
$\{z_t\}$, and on whether or not the researcher wishes to condition on
contemporaneous $z_t$. Including the entire observed past history of
$\{(y_t, z_t)\}$ or $\{y_t\}$ in $x_t$ restricts the analysis to cases where one is
interested in getting the dynamics of the conditional mean correctly
specified. In a cross section context, set $x_t \equiv z_t$ and assume that the
observations are independently distributed.

Suppose that one is considering the following parametric model for
$E(y_t \mid x_t)$:

$$\{m_t(x_t, \alpha): \alpha \in A,\ t = 1, 2, \ldots\}, \quad A \subset \mathbb{R}^P. \tag{2.1}$$

The null hypothesis that (2.1) is correctly specified for $E(y_t \mid x_t)$ is stated
formally as

$$H_0: E(y_t \mid x_t) = m_t(x_t, \alpha_o) \ \text{for some } \alpha_o \in A,\ t = 1, 2, \ldots \tag{2.2}$$

A general approach to testing the validity of $H_0$ is to assess the
performance of model (2.1) in light of alternative parametric specifications
for $E(y_t \mid x_t)$. Let

$$\{\mu_t(x_t, \beta): \beta \in B,\ t = 1, 2, \ldots\}, \quad B \subset \mathbb{R}^Q \tag{2.3}$$

be another such parametric family. In what follows, $m_t$ may or may not be
nested within $\mu_t$. The idea is to look for departures from $H_0$ in the
"direction" of model (2.3), which for convenience is labelled the
"alternative model." It is sometimes useful to refer explicitly to the
specific alternative hypothesis $H_1$:

$$H_1: E(y_t \mid x_t) = \mu_t(x_t, \beta_o) \ \text{for some } \beta_o \in B,\ t = 1, 2, \ldots$$

Under $H_0$ and standard regularity conditions the nonlinear least squares
(NLS) estimator $\hat\alpha_T$ is weakly consistent for $\alpha_o$; under further
regularity conditions, $\sqrt{T}(\hat\alpha_T - \alpha_o)$ is asymptotically normal or at
least $O_p(1)$. Explicit regularity conditions are not provided here; one set
of sufficient conditions is contained in Wooldridge (1988).

Whether or not $H_1$ is also true, the NLS estimator $\hat\beta_T$ for model (2.3)
can be computed by solving

$$\min_{\beta \in B} \sum_{t=1}^{T} (y_t - \mu_t(x_t, \beta))^2. \tag{2.4}$$

White and Domowitz (1984) have shown that, under $H_0$, $\hat\beta_T$ generally
converges in probability to a nonstochastic sequence
$\{\beta_T^o: T = 1, 2, \ldots\} \subset B$ which has the following optimality
property: $\beta_T^o$ solves the nonstochastic minimization problem

$$\min_{\beta \in B} \sum_{t=1}^{T} E[(y_t - \mu_t(x_t, \beta))^2]. \tag{2.5}$$

Further, $\sqrt{T}(\hat\beta_T - \beta_T^o)$ typically has a limiting normal distribution.
When the models are nonnested the asymptotic covariance matrix of
$\sqrt{T}(\hat\beta_T - \beta_T^o)$ can be fairly complicated, especially in time series
contexts. (This is because the implied errors
$\{y_t - \mu_t(x_t, \beta_T^o): t = 1, \ldots, T\}$ do not constitute a martingale
difference sequence under $H_0$; thus, they are usually serially correlated.)
Tests that require calculation of the asymptotic covariance matrix of
$\sqrt{T}(\hat\beta_T - \beta_T^o)$ and/or characterization of the pseudo-true value
function $\beta_T^o = b_T(\alpha_o)$ are unattractive from a computational viewpoint.
The complete parametric encompassing tests of Mizon and Richard (1986) have
this feature for nonlinear dynamic models. The next section develops simple,
regression-based tests which require only that
$\sqrt{T}(\hat\beta_T - \beta_T^o) = O_p(1)$ and $\sqrt{T}(\hat\alpha_T - \alpha_o) = O_p(1)$ under $H_0$.

3.  A  New  Test  Based  on  the  Encompassing  Principle 

The basis for the tests derived here is the statistical optimality of
the sequence $\{\beta_T^o: T = 1, 2, \ldots\}$. In particular, the idea is to exploit the
testable implications of $\beta_T^o$ solving the minimization problem (2.5); it is
here that the encompassing principle is invoked.

Define the residual function for model (2.1) as
$\epsilon_t(\alpha) \equiv y_t - m_t(x_t, \alpha)$. Under $H_0$ the true errors
$\epsilon_t^o \equiv \epsilon_t(\alpha_o)$ are defined and $E(\epsilon_t^o \mid x_t) = 0$.
Therefore, under $H_0$,

$$E[(y_t - \mu_t(x_t, \beta))^2] = E[(m_t(x_t, \alpha_o) - \mu_t(x_t, \beta))^2] + E[(\epsilon_t^o)^2] \tag{3.1}$$

$$\equiv E[(m_t(\alpha_o) - \mu_t(\beta))^2] + E[(\epsilon_t^o)^2].$$
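The decomposition (3.1) rests on the cross term vanishing under $H_0$. A short derivation, using only the definitions above and the law of iterated expectations (and the fact that $m_t$ and $\mu_t$ are functions of $x_t$), can be written as follows:

```latex
\begin{align*}
y_t - \mu_t(x_t,\beta)
  &= \epsilon_t^o + [\,m_t(x_t,\alpha_o) - \mu_t(x_t,\beta)\,], \\
E[(y_t - \mu_t(x_t,\beta))^2]
  &= E[(\epsilon_t^o)^2]
   + 2E\{\epsilon_t^o[\,m_t(x_t,\alpha_o) - \mu_t(x_t,\beta)\,]\}
   + E[(m_t(x_t,\alpha_o) - \mu_t(x_t,\beta))^2], \\
E\{\epsilon_t^o[\,m_t(x_t,\alpha_o) - \mu_t(x_t,\beta)\,]\}
  &= E\{E(\epsilon_t^o \mid x_t)[\,m_t(x_t,\alpha_o) - \mu_t(x_t,\beta)\,]\} = 0 .
\end{align*}
```

Only the first term on the right of (3.1) depends on $\beta$, which is why minimizing (2.5) under $H_0$ amounts to bringing $\mu_t$ as close as possible to $m_t(\alpha_o)$ in mean square.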

Assume that $\mu_t(x_t, \cdot)$ is differentiable on int(B),
$\{\beta_T^o: T = 1, 2, \ldots\} \subset$ int(B) uniformly in T, and derivatives and
expectations can be interchanged. Then $\beta_T^o$ must solve the first order
condition

$$T^{-1} \sum_{t=1}^{T} E[\nabla_\beta \mu_t(\beta_T^o)'(m_t(\alpha_o) - \mu_t(\beta_T^o))] = 0, \tag{3.2}$$

where $\nabla_\beta \mu_t(\beta) \equiv \nabla_\beta \mu_t(x_t, \beta)$ is the $1 \times Q$ gradient of
$\mu_t(x_t, \beta)$. Equation (3.2) is a testable implication of performing NLS on
model (2.3) when $H_0$ is true. To operationalize (3.2), remove the
expectations operator and replace the unknown values $\alpha_o$ and $\beta_T^o$ by
consistent estimators under $H_0$. Initially, let $\hat\alpha_T$ and $\hat\beta_T$ denote the
NLS estimators; however, the robust testing procedure subsequently derived is
valid if $\hat\alpha_T$ and $\hat\beta_T$ are any estimators such that
$\sqrt{T}(\hat\alpha_T - \alpha_o) = O_p(1)$ and $\sqrt{T}(\hat\beta_T - \beta_T^o) = O_p(1)$ under $H_0$. In
certain cases it is computationally convenient to use estimators other than
NLS for both the null and alternative models. An example is provided in
section 4.

A computable statistic is the $Q \times 1$ vector

$$T^{-1} \sum_{t=1}^{T} \nabla_\beta \mu_t(\hat\beta_T)'[m_t(\hat\alpha_T) - \mu_t(\hat\beta_T)] \tag{3.3}$$

$$= -T^{-1} \sum_{t=1}^{T} \nabla_\beta \mu_t(\hat\beta_T)'[(y_t - m_t(\hat\alpha_T)) - (y_t - \mu_t(\hat\beta_T))] \tag{3.4}$$

$$= -T^{-1} \sum_{t=1}^{T} \nabla_\beta \mu_t(\hat\beta_T)'\hat\epsilon_t \tag{3.5}$$

where $\hat\epsilon_t \equiv y_t - m_t(x_t, \hat\alpha_T)$, $t = 1, \ldots, T$, are the residuals for model
(2.1). Equation (3.5) follows from (3.4) and the first order condition for
$\hat\beta_T$. Thus, the optimality criterion leads to a test based on the
covariance of the gradient of the alternative regression function $\mu_t$ and
the residuals fitted under $H_0$. The statistic (3.5) is seen to be of the
conditional moment form analysed by Newey (1985), Tauchen (1985), White
(1987), and others. In order to distinguish the tests based on (3.5) from the
complete parametric encompassing tests of Mizon and Richard (1986), the
former will be called "conditional mean encompassing" (CME) tests. A CME
test is simply a Newey-Tauchen-White conditional moment test using
$\nabla_\beta \hat\mu_t$ as the misspecification indicator.
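The step from (3.3) to (3.5) can be checked numerically. The sketch below is only an illustration under assumed conditions: both models are linear (so NLS is OLS and the first order condition for the alternative holds exactly), the data are simulated, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
z = rng.normal(size=T)
x1 = np.column_stack([np.ones(T), z])        # regressors of the null model
x2 = np.column_stack([np.ones(T), z ** 2])   # regressors of the alternative model
y = x1 @ np.array([1.0, 0.5]) + rng.normal(size=T)  # data generated under the null

# Least squares fits of m_t = x1 a (null) and mu_t = x2 b (alternative).
alpha_hat = np.linalg.lstsq(x1, y, rcond=None)[0]
beta_hat = np.linalg.lstsq(x2, y, rcond=None)[0]

m_hat = x1 @ alpha_hat
mu_hat = x2 @ beta_hat
grad_mu = x2                 # gradient of x2 b with respect to b
resid_null = y - m_hat       # residuals fitted under the null

stat_33 = grad_mu.T @ (m_hat - mu_hat) / T   # form (3.3)
stat_35 = -grad_mu.T @ resid_null / T        # form (3.5)
print(stat_33, stat_35)
```

The two vectors agree up to rounding because the alternative-model residuals are orthogonal to the alternative gradient, which is exactly the first order condition invoked in the text.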

In a nested hypotheses framework, where $m_t(x_t, \alpha) = \mu_t(x_t, r(\alpha))$ for some
differentiable function $r: A \to B$, the statistic

$$T^{-1/2} \sum_{t=1}^{T} \nabla_\beta \mu_t(\hat\beta_T)'\hat\epsilon_t \tag{3.6}$$

is closely related to the statistic underlying the Lagrange Multiplier test.
Equation (3.6) leads exactly to the LM test if $\hat\beta_T$ in $\nabla_\beta \mu_t(\hat\beta_T)$ is
replaced by the constrained estimator $r(\hat\alpha_T)$. When the unconstrained
estimator $\hat\beta_T$ is used the resulting statistic is asymptotically equivalent
to the LM statistic under the null hypothesis and under local nested
alternatives.

In general, even if $m_t$ is not nested within $\mu_t$, (3.5) is the covariance
that arises in the construction of the LM statistic for testing exclusion of
$\nabla_\beta \mu_t(\hat\beta_T)$ in the regression (2.1). More precisely, consider the LM test
for $\delta_o = 0$ in the artificial regression model

$$y_t = m_t(x_t, \alpha_o) + \nabla_\beta \mu_t(\hat\beta_T)\delta_o + \text{error}_t. \tag{3.7}$$

One candidate test statistic is the $TR_u^2$ form of the LM test, where $R_u^2$ is
the uncentered r-squared from the regression

$$\hat\epsilon_t \ \text{ on } \ \nabla_\alpha \hat m_t,\ \nabla_\beta \hat\mu_t, \quad t = 1, \ldots, T. \tag{3.8}$$

Unfortunately, even when $\hat\alpha_T$ is the NLS estimator of $\alpha_o$, the resulting
statistic does not always have a limiting chi-square distribution under
$H_0$. It is true that if $\hat\alpha_T$ is the NLS estimator then under

$$H_0': H_0 \text{ holds and } V(y_t \mid x_t) = \sigma_o^2, \ \text{some } \sigma_o^2 > 0,\ t = 1, 2, \ldots \tag{3.9}$$

$TR_u^2$ obtained from (3.8) is asymptotically $\chi^2_Q$ distributed (assuming that
$\nabla_\beta \hat\mu_t$ does not contain redundancies with respect to $\nabla_\alpha \hat m_t$).
However, the assumption of conditional homoskedasticity under $H_0$ is
frequently implausible in economic applications, especially when $y_t$ is a
nonnegative variable. Further, by definition, a conditional mean hypothesis
imposes no restrictions on the conditional variance. One goal of this paper
is to develop tests based on (3.5) that do not make additional second moment
assumptions under $H_0$. This is straightforward since the statistic (3.5) is
of the general form that I have considered elsewhere (Wooldridge (1988)).
The following procedure, which first purges from $\nabla_\beta \hat\mu_t$ its linear
projection onto $\nabla_\alpha \hat m_t$, is valid under the regularity conditions of
Theorem 2.1 in Wooldridge (1988):


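PROCEDURE 3.1:

(i) Obtain $\hat\alpha_T$ and $\hat\beta_T$ by NLS, or some other procedure such that
$\sqrt{T}(\hat\alpha_T - \alpha_o) = O_p(1)$ and $\sqrt{T}(\hat\beta_T - \beta_T^o) = O_p(1)$. Save the
residuals $\hat\epsilon_t \equiv y_t - m_t(x_t, \hat\alpha_T)$ and the gradients
$\nabla_\alpha \hat m_t \equiv \nabla_\alpha m_t(\hat\alpha_T)$ and
$\hat\lambda_t \equiv \nabla_\beta \hat\mu_t \equiv \nabla_\beta \mu_t(\hat\beta_T)$;

(ii) Run the multivariate regression

$$\hat\lambda_t \ \text{ on } \ \nabla_\alpha \hat m_t, \quad t = 1, \ldots, T$$

and save the $1 \times Q$ vector residuals, say $\hat\xi_t$;

(iii) Run the regression

$$1 \ \text{ on } \ \hat\epsilon_t \hat\xi_t, \quad t = 1, \ldots, T$$

and use $TR_u^2 = T - SSR$ as asymptotically $\chi^2_Q$ under $H_0$, where SSR is the
sum of squared residuals. Let $\nabla_\alpha m_t^o \equiv \nabla_\alpha m_t(\alpha_o)$,
$\lambda_t^o \equiv \nabla_\beta \mu_t^o \equiv \nabla_\beta \mu_t(\beta_T^o)$ and define
$\{\xi_t^o: t = 1, \ldots, T\}$ to be the residuals from the population regression

$$\lambda_t^o \ \text{ on } \ \nabla_\alpha m_t^o, \quad t = 1, \ldots, T, \tag{3.10}$$

and let $\Xi_T^o \equiv T^{-1} \sum_{t=1}^{T} V(\xi_t^{o\prime} \epsilon_t^o)$. If
$\{\Xi_T^o: T = 1, 2, \ldots\}$ is not uniformly positive definite in T for T
sufficiently large then some elements of $\hat\lambda_t$ are redundant with respect
to $\nabla_\alpha \hat m_t$; the redundant elements in $\hat\lambda_t$ should be discarded and
the degrees of freedom reduced accordingly.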
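Steps (ii) and (iii) of Procedure 3.1 involve only linear least squares, so they are short to code. The following is a minimal NumPy sketch, not the author's implementation; the function name `cme_robust_stat` and the simulated data are assumptions made for illustration, and step (i) is taken as already done by the caller.

```python
import numpy as np

def cme_robust_stat(resid, grad_m, lam):
    """Return T - SSR from steps (ii)-(iii) of Procedure 3.1.

    resid  : (T,)   residuals from the null model
    grad_m : (T, P) gradient of the null regression function
    lam    : (T, Q) misspecification indicator (gradient of the alternative)
    """
    resid = np.asarray(resid, dtype=float)
    grad_m = np.asarray(grad_m, dtype=float)
    lam = np.asarray(lam, dtype=float)
    T = resid.shape[0]
    # Step (ii): purge from lam its linear projection onto grad_m.
    coef = np.linalg.lstsq(grad_m, lam, rcond=None)[0]
    xi = lam - grad_m @ coef
    # Step (iii): regress a vector of ones on resid_t * xi_t and form T - SSR.
    z = resid[:, None] * xi
    ones = np.ones(T)
    b = np.linalg.lstsq(z, ones, rcond=None)[0]
    ssr = np.sum((ones - z @ b) ** 2)
    return T - ssr

# Illustration: linear null with heteroskedastic errors, one extra indicator.
rng = np.random.default_rng(1)
T = 400
z = rng.normal(size=T)
x1 = np.column_stack([np.ones(T), z])
u = rng.normal(size=T) * np.sqrt(0.5 + z ** 2)   # variance depends on z
y = x1 @ np.array([1.0, 2.0]) + u
alpha_hat = np.linalg.lstsq(x1, y, rcond=None)[0]
lam = (z ** 2)[:, None]                          # hypothetical indicator, Q = 1
stat = cme_robust_stat(y - x1 @ alpha_hat, x1, lam)
print(stat)  # compare with a chi-square critical value with Q = 1 degree of freedom
```

Because SSR lies between 0 and T, the statistic always lies in [0, T], and it is unchanged if the indicator is rescaled. The sketch does not check for redundant columns in the indicator; that is left to the user, as in the text.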

Typically it is obvious upon inspection whether redundancies appear in
$\nabla_\beta \hat\mu_t$. A simple instance is when both models are linear and contain
overlapping regressors, a case considered more fully in the following
section.

The robust procedure not only has a limiting chi-square distribution
under $H_0$ in the presence of heteroskedasticity (conditional or unconditional)
of unknown form, but it also remains asymptotically efficient in the event
that $V(y_t \mid x_t)$ is constant. More precisely, it is shown in Wooldridge (1988)
that under alternatives local to $H_0$ that maintain conditional
homoskedasticity, the robust form of the test is asymptotically equivalent to
the more traditional regression test (3.8); robustness is obtained without
sacrificing asymptotic efficiency under ideal conditions. It follows that
any asymptotic power calculations under local alternatives and
homoskedasticity for the nonrobust statistic also hold for the robust
statistic. But the robust test has the further advantage of having an
asymptotic noncentral chi-square distribution under alternatives local to $H_0$
when heteroskedasticity is present.

Derivation of the limiting distribution of the CME statistic under
alternatives local to $H_0$ is fairly standard and is only sketched. For the
present purposes, a sequence of local alternatives to $H_0$ is characterized
by a sequence of minimizers $\{\alpha_T^*: T = 1, 2, \ldots\}$ of

$$\min_{\alpha \in A} T^{-1} \sum_{t=1}^{T} E[(y_t - m_t(x_t, \alpha))^2]$$

satisfying $\sqrt{T}(\hat\alpha_T - \alpha_T^*) = O_p(1)$, $\sqrt{T}(\alpha_T^* - \alpha_o) = O(1)$, and

$$T^{-1/2} \sum_{t=1}^{T} E[\nabla_\beta \mu_t(\beta_T^o)'\epsilon_t(\alpha_T^*)] = O(1).$$

Letting $\xi_t^o$ again denote the residuals under $H_0$ from the population
regression of $\lambda_t^o$ on $\nabla_\alpha m_t(\alpha_o)$, under standard regularity conditions
it is straightforward to show that

$$T^{-1/2} \sum_{t=1}^{T} \hat\xi_t'\hat\epsilon_t = T^{-1/2} \sum_{t=1}^{T} \xi_t^{o\prime}\epsilon_t^* + o_p(1)$$

under the sequence of local alternatives, where
$\epsilon_t^* \equiv \epsilon_t(\alpha_T^*)$. Thus, letting
$\pi_T^* \equiv T^{-1/2} \sum_{t=1}^{T} E(\xi_t^{o\prime}\epsilon_t^*) = O(1)$ and
$\Xi_T^o \equiv T^{-1} \sum_{t=1}^{T} V(\xi_t^{o\prime}\epsilon_t^o)$ ($\Xi_T^o$ is computed under $H_0$),
it follows that the CME test has a limiting noncentral chi-square
distribution with sequence of noncentrality parameters
$\{\pi_T^{*\prime}(\Xi_T^o)^{-1}\pi_T^*\}$. For particular alternatives $\pi_T^*$ can be
further simplified, but this is not attempted here. Incidentally, unlike the
robust test, the local distribution of the nonrobust test under
heteroskedasticity is typically unknown.


Another useful property of the robust procedure is that it is valid when
any $\sqrt{T}$-consistent estimator of $\alpha_o$ is used in step (i). This is in contrast
to traditional testing procedures, where the limiting distributions of
statistics typically depend on the limiting distribution of
$\sqrt{T}(\hat\alpha_T - \alpha_o)$ (an exception is Neyman's $C(\alpha)$ test). This added
flexibility of the robust procedure allows simple, regression-based tests in
situations where standard approaches can be computationally difficult.

It should be emphasized that the CME test was derived under the
assumption that $H_0$ is true. Another approach to comparing nonnested models
is to allow both models to be misspecified under the null. Rossi (1985)
offers a Bayesian approach to model selection when neither model is assumed
to be true. Vuong (1989) considers a generalized likelihood ratio approach
which assumes that neither model is correctly specified under $H_0$ but that, in
a well-defined statistical sense, they explain the data equally well.

Before turning to some examples, note that $\hat\beta_T$ can be any estimator such
that $\sqrt{T}(\hat\beta_T - \beta_T^o) = O_p(1)$ for some sequence $\{\beta_T^o\} \subset$ int(B)
uniformly in T. If $\{\beta_T^o\}$ does not have the optimality properties based on
(2.5) then the test statistic is not derivable from (3.3)-(3.5).
Nevertheless, the test based on Procedure 3.1 could be a useful diagnostic.

4.  Examples  of  Nonnested  Tests 

Because  the  heteroskedasticity-robust  Lagrange  Multiplier  statistic  has 
been considered elsewhere (Davidson and MacKinnon (1985), Wooldridge
(1987a)),  this  section  focuses  on  the  application  of  CME  tests  to  model 
specification  testing  in  the  presence  of  nonnested  alternatives. 




Example 4.1: The most well-known application of nonnested hypotheses testing
is to two competing linear models with different regressors. In particular,

$$m_t(x_t, \alpha) = x_{t1}\alpha \tag{4.1}$$

$$\mu_t(x_t, \beta) = x_{t2}\beta \tag{4.2}$$

where $x_{t1}$ and $x_{t2}$ are $1 \times P$ and $1 \times Q$ subvectors of $x_t$, with lag lengths
independent of t. Assume that there are a sufficient number of past
observations to start the indexing in (4.1) and (4.2) at t = 1. Let $w_{t2}$ be
the $1 \times M$ vector of regressors in $x_{t2}$ but not $x_{t1}$. Then the form of the test
which assumes homoskedasticity in addition to $H_0$ (see (3.8)) is simply the LM
test for $\delta_o = 0$ in the model

$$E(y_t \mid x_t) = x_{t1}\alpha_o + w_{t2}\delta_o, \quad t = 1, 2, \ldots \tag{4.3}$$

Under $H_0'$ the LM test is asymptotically equivalent to the standard Wald test
for exclusion of $w_{t2}$. In models with nonrandom regressors, the F-statistic
as a test in the presence of nonnested hypotheses has been studied
extensively by, among others, Ericsson (1983) and, more recently, as a
special case of the CPE test by Mizon and Richard (1986). The CME test is
the same whether or not $x_t$ contains lagged dependent variables or other
random regressors. To ensure that the test has correct asymptotic size in
the presence of heteroskedasticity, Procedure 3.1 can be applied with
$\nabla_\alpha \hat m_t = x_{t1}$ and $\hat\lambda_t = w_{t2}$.
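For two linear models, the homoskedasticity-based version of the test reduces to an auxiliary regression of the null residuals on the null regressors plus the non-overlapping alternative regressors. The sketch below illustrates this on simulated data; the column-matching rule used to build the non-overlapping block and all variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 300
const = np.ones(T)
a, b, c = rng.normal(size=(3, T))
x1 = np.column_stack([const, a, b])   # regressors of the null model (4.1)
x2 = np.column_stack([const, a, c])   # regressors of the alternative model (4.2)
y = x1 @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=T)

# w2: columns of x2 that are not already columns of x1 (here only c).
keep = [j for j in range(x2.shape[1])
        if not any(np.allclose(x2[:, j], x1[:, k]) for k in range(x1.shape[1]))]
w2 = x2[:, keep]

# Residuals from OLS on the null model.
alpha_hat = np.linalg.lstsq(x1, y, rcond=None)[0]
e_hat = y - x1 @ alpha_hat

# Auxiliary regression of the residuals on [x1, w2]; T times the uncentered R-squared.
X = np.column_stack([x1, w2])
g = np.linalg.lstsq(X, e_hat, rcond=None)[0]
ssr = np.sum((e_hat - X @ g) ** 2)
tr2 = T * (1.0 - ssr / np.sum(e_hat ** 2))
print(keep, tr2)  # degrees of freedom equal the number of columns in w2
```

Dropping the overlapping columns before running the auxiliary regression is what removes the redundancies mentioned in the text; the heteroskedasticity-robust version would instead pass `x1` and `w2` to the two regressions of Procedure 3.1.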

Example 4.2: Suppose that $y_t > 0$, and consider the following competing
models for $E(y_t \mid x_t)$:

$$m_t(x_t, \alpha) = w_t\alpha \tag{4.4}$$

$$\mu_t(x_t, \beta) = \exp(w_t\beta) \tag{4.5}$$

where $w_t$ is $1 \times P$. Again, in a time series context, assume that $w_t$ has a lag
length independent of t. Note that even though $y_t > 0$ the linear model (4.4)
cannot be ruled out a priori. In contrast, a normality assumption for $y_t$ is
untenable, and so it is not imposed under either model.

If the linear model is taken as the null and homoskedasticity is
maintained, the CME test is an LM-type test based on $TR_u^2$ from the regression

$$\hat\epsilon_t \ \text{ on } \ w_t,\ \exp(w_t\hat\beta_T)w_t, \quad t = 1, \ldots, T, \tag{4.6}$$

where $\hat\epsilon_t \equiv y_t - w_t\hat\alpha_T$ and $\hat\alpha_T$ is the OLS estimator of $\alpha_o$ under $H_0$.
Under $H_0$ and homoskedasticity, $TR_u^2 \xrightarrow{d} \chi^2_P$. Because homoskedasticity is
not always a reasonable assumption for nonnegative economic variables, the
heteroskedasticity-robust approach might be particularly useful. In
Procedure 3.1 simply set $\nabla_\alpha \hat m_t = w_t$ and $\hat\lambda_t = \exp(w_t\hat\beta_T)w_t$.
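A sketch of both versions of the test in Example 4.2 follows. It is illustrative only: the data are simulated under the linear null, the exponential model is fitted by a plain Gauss-Newton loop (an assumed choice, started from the OLS regression of log y on the regressors), and the helper name `ols` is hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients of y on X (y may have several columns)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(3)
T = 500
z = rng.uniform(0.0, 1.0, size=T)
w = np.column_stack([np.ones(T), z])
# Linear null holds; error spread rises with z; y stays strictly positive.
y = w @ np.array([1.0, 2.0]) + (0.2 + 0.6 * z) * rng.uniform(-1.0, 1.0, size=T)

alpha_hat = ols(w, y)
e_hat = y - w @ alpha_hat

# Alternative model: Gauss-Newton iterations for NLS of y on exp(w b).
beta_hat = ols(w, np.log(y))
for _ in range(50):
    mu = np.exp(w @ beta_hat)
    G = mu[:, None] * w                     # gradient of exp(w b) w.r.t. b
    beta_hat = beta_hat + ols(G, y - mu)    # Gauss-Newton step
lam = np.exp(w @ beta_hat)[:, None] * w     # indicator exp(w b) w

# Nonrobust version: T times uncentered R-squared from regression (4.6).
X = np.column_stack([w, lam])
ssr = np.sum((e_hat - X @ ols(X, e_hat)) ** 2)
tr2_nonrobust = T * (1.0 - ssr / np.sum(e_hat ** 2))

# Robust version: steps (ii)-(iii) of Procedure 3.1.
xi = lam - w @ ols(w, lam)
Z = e_hat[:, None] * xi
ones = np.ones(T)
tr2_robust = T - np.sum((ones - Z @ ols(Z, ones)) ** 2)
print(tr2_nonrobust, tr2_robust)
```

Since the text requires only a root-T consistent estimator for the alternative model, the Gauss-Newton loop could be replaced by any other such estimator without changing the two regression steps.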

Example 4.3: Frequently researchers are interested in comparing linear and
log-linear regression models. Although this is certainly in the spirit of
comparing linear and exponential forms for $E(y_t \mid x_t)$, linearity of
$E(\log y_t \mid x_t)$ need not imply that $E(y_t \mid x_t)$ has an exponential form, nor vice
versa. To compare linear and log-linear models a further assumption is
needed under the log-linear model. Many tests assume that

$$\log y_t \mid x_t \sim N(w_t\delta_o, \sigma_o^2), \ \text{some } \delta_o \in \Delta, \ \text{some } \sigma_o^2 > 0. \tag{4.7}$$

Here it suffices to make the weaker assumptions

$$E(y_t \mid x_t) = \exp(w_t\alpha_o) \tag{4.8}$$

$$E(y_t \mid x_t) = \exp[\kappa_o + E(\log y_t \mid x_t)] \tag{4.9}$$

for some $\kappa_o > 0$. If (4.7) holds it is well known that $\kappa_o = \sigma_o^2/2$ in (4.9).
The conditional mean of $\log y_t$ under (4.8) and (4.9) is

$$E(\log y_t \mid x_t) = \alpha_{o1} - \kappa_o + \alpha_{o2}w_{t2} + \ldots + \alpha_{oP}w_{tP} \equiv w_t\delta_o,$$

where it has been assumed that $w_{t1} \equiv 1$. Testing the log-linear model against
the linear alternative is very simple. First, let $\hat\delta_T$ and $\widehat{\log y_t}$ be the OLS
estimator and fitted values from the regression

$$\log y_t \ \text{ on } \ w_t, \quad t = 1, \ldots, T.$$

Compute an estimate of $\exp(\kappa_o)$ and predicted values of $y_t$ from the OLS
regression of $y_t$ on $\exp[\widehat{\log y_t}]$ (without an intercept). Let $\hat\kappa_T$ and $\hat y_t$ be the
estimator of $\kappa_o$ and the fitted values of $y_t$, respectively, and define the
residuals as $\hat\epsilon_t = y_t - \hat y_t$. Then simply apply Procedure 3.1 with
$\nabla_\alpha \hat m_t = \hat y_t w_t$ and $\hat\lambda_t = w_t$. Note that the computations required for the
test can be done entirely by linear least squares regressions. Also, the
implicit estimator for $\alpha_o$,
$\hat\alpha_T = (\hat\delta_{T1} + \hat\kappa_T, \hat\delta_{T2}, \ldots, \hat\delta_{TP})$, has no particular optimality
properties even under conditional homoskedasticity, yet the test is
asymptotically equivalent to the procedure which uses the NLS estimator of
$\alpha_o$.

This test requires only the additional assumption (4.9) to compare
linear and log-linear regression models, and not the stronger assumption
(4.7). If (4.7) is believed to be true then this test cannot be optimal; the
only information about $y_t$ that is used is the exponential form of the
conditional expectation, so that additional information about the conditional
distribution of $y_t$ given $x_t$ is ignored. The strength of the current approach
is that it does not require distributional or second moment assumptions under
either model. The Cox (1961,1962) test, which requires distributional
assumptions under $H_0$ as well as $H_1$, can be quite difficult to compute (see
Aneuryn-Evans and Deaton (1980)).

The procedure that takes the linear model as the null hypothesis (and
does not impose distributional or variance assumptions under the null) is the
same as Example 4.2, except that $\hat\beta_T$ is constructed from a log-linear OLS
regression as above rather than NLS. Compared with procedures that impose a
plausible distribution of $y_t \mid x_t$ in the linear model, the current approach is
more robust and computationally much easier.
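The log-linear-null test of Example 4.3 uses only linear regressions, which the following sketch spells out step by step. The simulated data (generated with a log-linear mean), the helper `ols`, and the variable names are assumptions made for illustration; the first regressor is taken to be the constant, as in the text.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients of y on X (y may have several columns)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(4)
T = 500
w = np.column_stack([np.ones(T), rng.normal(size=T)])   # first column is the constant
y = np.exp(w @ np.array([0.5, 0.3]) + 0.4 * rng.normal(size=T))

# Step 1: OLS of log(y) on w gives delta_hat and fitted values of log(y).
delta_hat = ols(w, np.log(y))
log_fit = w @ delta_hat

# Step 2: OLS of y on exp(log_fit) without intercept; the slope estimates exp(kappa_o).
g = np.exp(log_fit)
scale = (g @ y) / (g @ g)
y_fit = scale * g
e_hat = y - y_fit

# Implied estimator of alpha_o: shift the intercept by kappa_hat = log(scale).
alpha_hat = delta_hat.copy()
alpha_hat[0] += np.log(scale)

# Step 3: Procedure 3.1 with null gradient y_fit * w and indicator w.
grad_m = y_fit[:, None] * w
lam = w
xi = lam - grad_m @ ols(grad_m, lam)
Z = e_hat[:, None] * xi
ones = np.ones(T)
stat = T - np.sum((ones - Z @ ols(Z, ones)) ** 2)
print(alpha_hat, stat)  # P = 2 indicators here, so 2 degrees of freedom
```

The fitted values satisfy `exp(w @ alpha_hat) == y_fit` up to rounding, which is the sense in which the two OLS regressions define an implicit estimator of the exponential-mean parameters.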

5.  Extension  to  Weighted  Nonlinear  Least  Squares 

The approach of section 3 extends directly to the case where one or both
models are estimated by weighted NLS (WNLS). Let

$$\{m_t(x_t, \alpha): \alpha \in A\}, \quad \{h_t(x_t, \gamma): \gamma \in \Gamma\} \tag{5.1}$$

and

$$\{\mu_t(x_t, \beta): \beta \in B\}, \quad \{\eta_t(x_t, \delta): \delta \in \Delta\} \tag{5.2}$$

be the "competing" models, where $h_t$ and $\eta_t$ are weighting functions such that
$h_t(x_t, \gamma) > 0$, $\eta_t(x_t, \delta) > 0$. It is important to stress that the null
hypothesis is the same as in section 3, i.e.

$$H_0: E(y_t \mid x_t) = m_t(x_t, \alpha_o) \ \text{for some } \alpha_o \in A,\ t = 1, 2, \ldots \tag{5.3}$$

It is not assumed that $h_t(x_t, \gamma)$ is a correctly specified parameterized
version of the conditional variance of $y_t$ given $x_t$ under $H_0$ (i.e. it is not
assumed that $h_t(x_t, \gamma_o)$ is proportional to $V(y_t \mid x_t)$ for some $\gamma_o \in \Gamma$).
Instead, assume that there are estimators of the nuisance parameters $\gamma$ and $\delta$
such that

$$\sqrt{T}(\hat\gamma_T - \gamma_T^o) = O_p(1), \quad \sqrt{T}(\hat\delta_T - \delta_T^o) = O_p(1) \tag{5.4}$$

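where $\{\gamma_T^o\}$ and $\{\delta_T^o\}$ are nonstochastic sequences. First suppose that $\hat\alpha_T$ is
the WNLS estimator that solves

$$\min_{\alpha \in A} \sum_{t=1}^{T} (y_t - m_t(x_t, \alpha))^2 / h_t(x_t, \hat\gamma_T). \tag{5.5}$$

The WNLS estimator based on model (5.2) solves

$$\min_{\beta \in B} \sum_{t=1}^{T} (y_t - \mu_t(x_t, \beta))^2 / \eta_t(x_t, \hat\delta_T). \tag{5.6}$$

The solution to (5.6), again denoted $\hat\beta_T$, is generally such that

$$\sqrt{T}(\hat\beta_T - \beta_T^o) = O_p(1) \tag{5.7}$$

where $\beta_T^o$ solves the nonstochastic minimization problem

$$\min_{\beta \in B} \sum_{t=1}^{T} E[(y_t - \mu_t(x_t, \beta))^2 / \eta_t(x_t, \delta_T^o)].$$

Under $H_0$,

$$E[(y_t - \mu_t(x_t, \beta))^2 / \eta_t(x_t, \delta_T^o)] = E[(m_t(\alpha_o) - \mu_t(\beta))^2 / \eta_t(\delta_T^o)] + E[(\epsilon_t^o)^2 / \eta_t(\delta_T^o)]. \tag{5.8}$$

The appropriate first order condition for $\beta_T^o$ is

$$T^{-1} \sum_{t=1}^{T} E[\nabla_\beta \mu_t(\beta_T^o)'(m_t(\alpha_o) - \mu_t(\beta_T^o)) / \eta_t(\delta_T^o)] = 0$$

and the relevant statistic is

$$T^{-1} \sum_{t=1}^{T} \nabla_\beta \mu_t(\hat\beta_T)'\hat\epsilon_t / \eta_t(\hat\delta_T)$$

$$= T^{-1} \sum_{t=1}^{T} [h_t(\hat\gamma_T)^{-1/2}(h_t(\hat\gamma_T)/\eta_t(\hat\delta_T))\nabla_\beta \mu_t(\hat\beta_T)]' h_t(\hat\gamma_T)^{-1/2}\hat\epsilon_t$$

$$\equiv T^{-1} \sum_{t=1}^{T} [\hat h_t^{-1/2}\hat\lambda_t]' \hat h_t^{-1/2}\hat\epsilon_t \tag{5.9}$$

where $\hat\epsilon_t \equiv y_t - m_t(\hat\alpha_T)$ and

$$\hat\lambda_t \equiv (\hat h_t/\hat\eta_t)\nabla_\beta \hat\mu_t. \tag{5.10}$$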

If the models are nested and $\hat h_t = \hat\eta_t$ then (5.9) leads to a statistic that is
asymptotically equivalent to the usual LM statistic in the context of WNLS.

More generally, (5.9) suggests basing a test on the correlation of the
weighted indicator $\hat h_t^{-1/2}\hat\lambda_t$ and the weighted residuals under $H_0$,
$\hat h_t^{-1/2}\hat\epsilon_t$. Note that the indicator $\hat\lambda_t$ is a particular weighting of the
gradient of the alternative regression function. If instead of $H_0$ the null
hypothesis is

$$H_0'': H_0 \text{ holds, and for some } \gamma_o \in \Gamma,\ \sigma_o^2 > 0, \quad V(y_t \mid x_t) = \sigma_o^2 h_t(x_t, \gamma_o), \quad t = 1, 2, \ldots \tag{5.11}$$

then, assuming now that $\hat\gamma_T$ is a $\sqrt{T}$-consistent estimator of $\gamma_o$, a simple
regression test using the weighted residuals as the dependent variable is
available. Let $\tilde\epsilon_t \equiv \hat h_t^{-1/2}\hat\epsilon_t$, $\nabla_\alpha \tilde m_t \equiv \hat h_t^{-1/2}\nabla_\alpha \hat m_t$, and
$\tilde\lambda_t \equiv \hat h_t^{-1/2}\hat\lambda_t = (\hat h_t^{1/2}/\hat\eta_t)\nabla_\beta \hat\mu_t$. The LM-like test is obtained by
running the regression

$$\tilde\epsilon_t \ \text{ on } \ \nabla_\alpha \tilde m_t,\ \tilde\lambda_t, \quad t = 1, \ldots, T \tag{5.12}$$

and using $TR_u^2$ as asymptotically $\chi^2_Q$ under $H_0''$ (assuming no redundancies in
$\tilde\lambda_t$). The following procedure is valid whether or not
$\{h_t(x_t, \gamma): \gamma \in \Gamma\}$ contains a version of $V(y_t \mid x_t)$ under $H_0$.


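PROCEDURE 5.1:

(i) Let $\hat\alpha_T$ be any $\sqrt{T}$-consistent estimator of $\alpha_o$ under $H_0$, and let
$\hat\beta_T$ be an estimator such that $\sqrt{T}(\hat\beta_T - \beta_T^o) = O_p(1)$. Compute $\hat\alpha_T$,
$\hat h_t$, $\nabla_\alpha \hat m_t$, $\hat\epsilon_t$, $\hat\beta_T$, $\hat\eta_t$, $\nabla_\beta \hat\mu_t$ and
$\hat\lambda_t \equiv (\hat h_t/\hat\eta_t)\nabla_\beta \mu_t(\hat\beta_T)$. Define $\tilde\epsilon_t \equiv \hat h_t^{-1/2}\hat\epsilon_t$,
$\nabla_\alpha \tilde m_t \equiv \hat h_t^{-1/2}\nabla_\alpha \hat m_t$, and $\tilde\lambda_t \equiv \hat h_t^{-1/2}\hat\lambda_t$;

(ii) Run the regression

$$\tilde\lambda_t \ \text{ on } \ \nabla_\alpha \tilde m_t, \quad t = 1, \ldots, T$$

and save the residuals, say $\tilde\xi_t$;

(iii) Run the regression

$$1 \ \text{ on } \ \tilde\epsilon_t \tilde\xi_t, \quad t = 1, \ldots, T$$

and use $TR_u^2 = T - SSR$ as asymptotically $\chi^2_Q$ under $H_0$. Again, delete any
redundant elements in $\tilde\lambda_t$ and reduce the degrees of freedom as needed.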
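Procedure 5.1 differs from Procedure 3.1 only in the construction of the indicator and in the weighting by the inverse square root of the null weights. The sketch below is a hedged illustration: the function name `cme_wnls_stat`, the simulated data, the working variance guess, and the single-column alternative gradient are all assumptions, and the estimates from step (i) are supplied by the caller.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients of y on X (y may have several columns)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def cme_wnls_stat(resid, grad_m, grad_mu, h, eta):
    """T - SSR from steps (ii)-(iii) of Procedure 5.1.

    resid   : (T,)   residuals from the null model
    grad_m  : (T, P) gradient of the null regression function
    grad_mu : (T, Q) gradient of the alternative regression function
    h, eta  : (T,)   positive fitted weights for the null and alternative models
    """
    resid, h, eta = (np.asarray(v, dtype=float) for v in (resid, h, eta))
    lam = (h / eta)[:, None] * np.asarray(grad_mu, dtype=float)  # indicator (5.10)
    root = 1.0 / np.sqrt(h)
    e_t = root * resid                                           # weighted residuals
    gm_t = root[:, None] * np.asarray(grad_m, dtype=float)       # weighted null gradient
    lam_t = root[:, None] * lam                                  # weighted indicator
    xi_t = lam_t - gm_t @ ols(gm_t, lam_t)                       # step (ii)
    Z = e_t[:, None] * xi_t
    ones = np.ones(resid.shape[0])
    return resid.shape[0] - np.sum((ones - Z @ ols(Z, ones)) ** 2)  # step (iii)

rng = np.random.default_rng(5)
T = 400
z = rng.uniform(0.5, 2.0, size=T)
w = np.column_stack([np.ones(T), z])
y = w @ np.array([1.0, 1.0]) + np.sqrt(z) * rng.normal(size=T)
h = z                                        # working guess for the variance shape
alpha_hat = ols(w / np.sqrt(h)[:, None], y / np.sqrt(h))   # weighted least squares
resid = y - w @ alpha_hat
grad_mu = (z ** 2)[:, None]                  # hypothetical alternative gradient, Q = 1
eta = np.ones(T)
stat = cme_wnls_stat(resid, w, grad_mu, h, eta)
stat_rescaled = cme_wnls_stat(resid, w, grad_mu, 3.0 * h, eta)
print(stat, stat_rescaled)
```

The statistic is unchanged when the null weights are multiplied by a positive constant, so the weights need to be right only up to scale, which matches the "proportional to" wording in the text.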

The weighted NLS extension allows simple robust tests for a wide variety
of models. In particular, quasi-maximum likelihood estimation (QMLE) of a
linear exponential family is accommodated because the QMLE is asymptotically
equivalent to a particular WNLS estimator with estimated weights (see
Gourieroux, Monfort, and Trognon (1984)).

Being agnostic about whether the family $\{h_t(x_t, \gamma): \gamma \in \Gamma\}$ contains the
conditional variance of $y_t$ under $H_0$ allows for possible improvements over NLS
(although this is in no way guaranteed!) while guarding against inference
with incorrect asymptotic size due to a misspecified variance. Moreover,
nothing is lost in terms of local power if the weighting function happens to
be correctly specified for the conditional variance. In other words, the
robust procedure is optimal (in the class of WNLS procedures) if $h_t(x_t, \gamma)$ is
a correctly specified version of $V(y_t \mid x_t)$, and so $h_t(x_t, \gamma)$ should reflect the
researcher's best guess for $V(y_t \mid x_t)$.

t   t 

As an illustration, consider the analysis of count data. One might
believe that a conditional Poisson distribution provides a better
approximation to the second moment of $y_t$ than, say, the assumption of
homoskedasticity. However, the assumption that the conditional mean and
variance are equal (or proportional) is not one on which conditional mean
specification testing should rely. And if the mean and variance do happen to
be equal, nothing is lost asymptotically by using the robust procedure.

If the nonrobust tests had the ability to systematically detect
violation of the conditional variance assumption then the nonrobustness
criticism would be somewhat mitigated. However, the nonrobust conditional
mean tests (and the robust forms proposed in this paper) are inconsistent
against the alternative

    $H_1'$: $H_0$ holds but $V(y_t|x_t) \neq \sigma^2 h_t(x_t,\gamma)$ for all $\gamma \in \Gamma$, $\sigma^2 > 0$.    (5.13)

Consequently, one should not expect to detect departures from the conditional
variance assumption by using nonrobust conditional mean tests. Under $H_1'$ the
actual size of the nonrobust test can be larger or smaller than the nominal
size, and it is difficult if not impossible to determine a priori which is
likely to be the case.

The asymptotic local distribution of the CME test for WNLS is analogous
to the NLS case. The indicator $\lambda_t \equiv (h_t/\eta_t)\nabla_\beta\mu_t$ now replaces $\nabla_\beta\mu_t$. With
this modification the same calculation works if $\varepsilon_t$ and $\xi_t$ are simply weighted
by $h_t^{-1/2}$.

Turning now to an example of a CME test for a weighted NLS problem,
again consider testing an exponential versus linear regression model.

Example 5.1: The competing models are

    $m_t(x_t,\alpha) = \exp(w_t\alpha)$    (5.14)

    $\mu_t(x_t,\beta) = w_t\beta$.    (5.15)

Suppose that the test which takes the exponential model as the null is to be
based on a weighted sum of squared residuals. In particular, let the
weighting function be the square of the regression function: $h_t(x_t,\gamma) = [\exp(w_t\alpha)]^2$.
It is important to stress that $[\exp(w_t\alpha_o)]^2$ is not assumed to
be proportional to $V(y_t|x_t)$ under $H_0$, although this of course is not ruled
out.

The estimator $\hat\alpha_T$ can be the NLS estimator, or the WNLS estimator which
solves

    $\min_{\alpha \in A} \sum_{t=1}^{T} (y_t - \exp(w_t\alpha))^2 / \hat h_t$.

In the context of Example 4.3 under (4.8) and (4.9), a computationally
convenient estimator is obtained from the log-linear regression. For any
$\sqrt{T}$-consistent estimator $\hat\alpha_T$ define the weighted residuals and weighted
gradient as $\tilde\varepsilon_t \equiv \hat h_t^{-1/2}\hat\varepsilon_t$, $\nabla_\alpha\tilde m_t \equiv \hat h_t^{-1/2}\nabla_\alpha\hat m_t = \exp(-w_t\hat\alpha_T)\exp(w_t\hat\alpha_T)w_t = w_t$. The
indicator is $\hat\lambda_t \equiv w_t$, and the weighted indicator is $\tilde\lambda_t \equiv \hat h_t^{-1/2}w_t =
\exp(-w_t\hat\alpha_T)w_t$. These quantities are then used in Procedure 5.1.

Note that in the setup of Example 4.3 all computations can be carried
out by OLS. Also, the weights can be easily computed as $\hat h_t =
[\exp(\widehat{\log y_t})]^2$.
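As a rough illustration of Example 5.1, the weighted quantities can be built directly from the general definitions and checked against the simplified forms stated above. The function name `example_5_1_inputs` and its arguments are hypothetical; `alpha_hat` stands for whatever consistent estimate is available, and no estimation is performed here.

```python
import numpy as np

def example_5_1_inputs(y, w, alpha_hat):
    """Weighted residuals, gradient and indicator when h_t = [exp(w_t a)]^2.

    y : (T,) outcomes, w : (T, K) regressors, alpha_hat : (K,) estimate.
    """
    m_hat = np.exp(w @ alpha_hat)        # fitted exponential mean
    h_hat = m_hat ** 2                   # weighting function of the example
    root_w = h_hat ** -0.5               # h^{-1/2} = exp(-w_t a)

    eps_tilde = root_w * (y - m_hat)     # weighted residuals
    grad_m = m_hat[:, None] * w          # gradient of exp(w a) w.r.t. a
    grad_tilde = root_w[:, None] * grad_m   # collapses to w
    lam_tilde = root_w[:, None] * w      # weighted indicator exp(-w a) w
    return eps_tilde, grad_tilde, lam_tilde
```

Because the weights cancel the exponential in the gradient, the weighted gradient returned here is numerically equal to the regressor matrix itself, which is why the auxiliary regressions in this example involve only the regressors and rescaled versions of them.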

Consideration of weighted NLS introduces a possibility not allowed in
the framework of section 3. Provided that $h_t$ and $\eta_t$ are sufficiently
different one can take $\alpha = \beta$ and $\mu_t(x_t,\beta) \equiv m_t(x_t,\alpha)$. That is, suppose one
does not have a particular alternative to $m_t$ in mind (either nested or
nonnested), but instead another WNLS estimator is used to detect
misspecification of $m_t$. This application of the Durbin-Wu-Hausman
methodology has been considered by White (1980) in the context of NLS on
cross section data. It can be shown that, when $h_t(x_t,\gamma)$ is correctly
specified for $V(y_t|x_t)$, the statistic obtained from the regression (5.12) is
asymptotically equivalent to the DWH statistic that compares the difference
of the two WNLS estimators and exploits the relative efficiency of the WNLS
estimator based on $\hat h_t$ (for a similar result, see Ruud (1984)). The robust
approach obtained by setting $\mu_t \equiv m_t$ in Procedure 5.1 does not require either
estimator to be relatively efficient under $H_0$, but it is still asymptotically
equivalent to the usual DWH test in the event that $h_t(x_t,\gamma)$ is correctly
specified for $V(y_t|x_t)$.

6.  Comparison  with  Other  Related  Nonnested  Hypotheses  Tests 


Davidson and MacKinnon (1981) (DM) suggested a method for testing
nonnested, nonlinear regression models which has proven to be useful in
practice. Their approach can be derived from the general framework of Cox
(1961, 1962) under normality and homoskedasticity.

In the notation of this paper, the DM statistic is obtained by testing
$\delta_o = 0$ in the artificial model

    $y_t = (1-\delta_o)m_t(x_t,\alpha_o) + \delta_o\mu_t(x_t,\hat\beta_T) + \text{error}$    (6.1)

The LM form of the test is particularly convenient since it requires only NLS
estimation of each model and then one auxiliary OLS regression. Let $\hat\varepsilon_t \equiv y_t
- m_t(x_t,\hat\alpha_T)$ be the residuals from the model under $H_0$. Then the LM approach
is to compute $R_u^2$ from the regression

    $\hat\varepsilon_t$ on $\nabla_\alpha\hat m_t$, $\hat\mu_t - \hat m_t$,    $t = 1,\ldots,T$    (6.2)

and use $TR_u^2$ as asymptotically $\chi_1^2$ under $H_0$. Thus, the DM test is simply an
omitted variables test of $\hat\mu_t - \hat m_t$ in the nonlinear model

    $y_t = m_t(x_t,\alpha_o) + \varepsilon_t$.    (6.3)
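The auxiliary regression (6.2) is straightforward to code. The sketch below assumes the residuals, the gradient of the null mean function, and the fitted values from both models are already available as arrays; the function name `dm_lm_statistic` is hypothetical.

```python
import numpy as np

def dm_lm_statistic(eps_hat, grad_m, mu_hat, m_hat):
    """T times the uncentered R^2 from regressing eps_hat on [grad_m, mu_hat - m_hat]."""
    X = np.column_stack([grad_m, mu_hat - m_hat])
    coef, *_ = np.linalg.lstsq(X, eps_hat, rcond=None)
    fitted = X @ coef
    # Uncentered R^2: explained sum of squares over the raw sum of squares.
    r2_u = (fitted @ fitted) / (eps_hat @ eps_hat)
    return eps_hat.size * r2_u
```

The returned value would be compared with a chi-square critical value with one degree of freedom; as the surrounding text stresses, this nonrobust form relies on homoskedasticity.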

The standard DM test, as well as the LM form in (6.2), is invalid in the
presence of heteroskedasticity. A robust version can be computed by
modifying the misspecification indicator in Procedure 3.1: simply set $\hat\lambda_t \equiv
\hat\mu_t - \hat m_t$ (see Wooldridge (1987b)). Because the robust version allows $\hat\alpha_T$ and
$\hat\beta_T$ to be any $\sqrt{T}$-consistent estimators, a DM test for the log-linear versus
linear model can be computed entirely with OLS along the lines of Example
4.3.
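A minimal sketch of the robust DM statistic follows. It assumes that Procedure 3.1 consists of the same two auxiliary regressions as Procedure 5.1 with unit weights (first purge the indicator of the null gradient, then regress 1 on the product of the residuals); that reading, and the function name `robust_dm_statistic`, are assumptions made for illustration.

```python
import numpy as np

def robust_dm_statistic(eps_hat, grad_m, mu_hat, m_hat):
    """Robust one-degree-of-freedom DM statistic, T - SSR from regressing 1 on eps*r."""
    lam = mu_hat - m_hat                       # scalar indicator
    coef, *_ = np.linalg.lstsq(grad_m, lam, rcond=None)
    r = lam - grad_m @ coef                    # indicator purged of the gradient
    z = eps_hat * r
    # With a single regressor z and no intercept, T - SSR = (sum z)^2 / sum z^2.
    return z.sum() ** 2 / (z @ z)
```

The closed form in the last line avoids a second call to a least squares routine; it is only available because the indicator is a scalar.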

A robust DM test based on WNLS estimation (and therefore for QMLE in a
linear exponential family) is also easy to obtain. Let the mean and
weighting functions be given by (5.1) and (5.2), with the null hypothesis
taken to be (5.3). Then the indicator $\hat\lambda_t$ for the DM test is the scalar $\hat\lambda_t \equiv
(\hat h_t/\hat\eta_t)(\hat\mu_t - \hat m_t)$. Note that the same reweighting of the indicator that
appears in the CME test also shows up in the DM test for weighted nonlinear
regressions. This misspecification indicator is used in Procedure 5.1 in
place of $(\hat h_t/\hat\eta_t)\nabla_\beta\hat\mu_t$.

Even though the DM test is only a one degree of freedom test, it is
always consistent against the alternative $H_1$. In the case of unweighted NLS,
consistency of the test follows if it can be shown that, under $H_1$,

    $\liminf_{T\to\infty} \left| T^{-1}\sum_{t=1}^{T} E[(\mu_t(\beta_o) - m_t(\alpha_T^o))\varepsilon_t(\alpha_T^o)] \right| > 0$.    (6.4)

But $\varepsilon_t(\alpha_T^o) \equiv y_t - m_t(\alpha_T^o) = e_t(\beta_o) + \mu_t(\beta_o) - m_t(\alpha_T^o)$, where $e_t(\beta) \equiv y_t
- \mu_t(x_t,\beta)$. Under $H_1$, $E(e_t^o|x_t) = 0$, so that

    $E[(\mu_t(\beta_o) - m_t(\alpha_T^o))\varepsilon_t(\alpha_T^o)] = E[(\mu_t(\beta_o) - m_t(\alpha_T^o))^2]$,

and (6.4) holds except in uninteresting degenerate situations.

The CME test has degrees of freedom that depend on the dimension of $\beta$ in
the alternative model; without redundancies, the degrees of freedom equals
the dimension of $\beta$. The condition for consistency of the CME test against $H_1$
is

    $\liminf_{T\to\infty} \left| T^{-1}\sum_{t=1}^{T} E[\nabla_\beta\mu_t(\beta_o)'\varepsilon_t(\alpha_T^o)] \right| > 0$,    (6.5)

where $|\cdot|$ now denotes Euclidean norm. Under $H_1$,

    $E[\nabla_\beta\mu_t(\beta_o)'\varepsilon_t(\alpha_T^o)] = E[\nabla_\beta\mu_t(\beta_o)'(\mu_t(\beta_o) - m_t(\alpha_T^o))]$

and the condition for consistency reduces to

    $\liminf_{T\to\infty} \left| T^{-1}\sum_{t=1}^{T} E[\nabla_\beta\mu_t(\beta_o)'(\mu_t(\beta_o) - m_t(\alpha_T^o))] \right| > 0$.    (6.6)

For general $\mu_t$ and $m_t$ it is possible for (6.6) to fail. Nevertheless, for
linear models, (6.6) holds except in degenerate situations. Also, when $m_t$
and $\mu_t$ are linear or exponential functions, (6.6) holds provided that the
regressors contain a constant. Consistency is easy to establish for the
LM-type tests that employ the NLS estimators since the regressors in the DM
auxiliary regression are linear combinations of the regressors in the CME
test regression. To verify consistency of the CME test for the more general
robust procedure, (6.6) can be demonstrated directly for all of the examples
in section 4.

The dimension of the space of alternatives against which the CME test is
consistent is greater than the corresponding dimension for the DM test. When
the alternatives of interest consist only of the specified competing model
then the one degree of freedom test may be adequate. However, as a general
model diagnostic, the DM test may not have sufficient power against certain
alternatives of interest. For more on the issue of the "implicit null
hypothesis" of a test the reader is referred to Pesaran (1982), MacKinnon
(1983, with discussion), Mizon and Richard (1986, section 4), and Davidson
and MacKinnon (1987). Characterizing the implicit null in a useful manner
for the CME test in nonlinear, dynamic models is difficult and is necessarily
done on a case by case basis. Mizon and Richard (1986, section 4) find the
implicit null of the DM and CPE tests for competing linear models with
strictly exogenous regressors; the same calculation works for the CME test in
this case.

As mentioned several times above, Mizon and Richard (1986) develop the
notion of complete parametric encompassing tests and discuss how they can be
applied to testing nonnested hypotheses. The CPE tests are closely related
to the tests of Gourieroux, Monfort, and Trognon (1983): both approaches
rely on the notion of a pseudo-true value in the alternative model. The
tests derived by MR and GMT lead to well-known tests in nested situations,
and are similar in spirit to the tests derived here. To compare the CPE
tests and the CME tests the CPE principle must be extended to the case where
the conditional distribution of $y_t$ given $x_t$ and joint distribution of
$(y_1,z_1),\ldots,(y_T,z_T)$ are not completely specified. Letting $b_T(\alpha_o)$ denote the
pseudo-true value of $\beta$ under $H_0$, the Wald encompassing test (WET) is based on

    $T^{1/2}(\hat\beta_T - b_T(\hat\alpha_T))$.    (6.6)

It can be shown that when the regressors are treated as nonrandom (or
strictly exogenous), (6.6) and (3.6) are asymptotically equivalent up to
multiplication by a sequence of uniformly positive definite matrices. Thus,
in this case, the CPE and CME tests are asymptotically equivalent, and the
CME test can be viewed as a computationally simple robust version of the CPE
tests of Mizon and Richard (1986) and GMT (1983). The equivalence breaks
down for general dynamic models, partly because the pseudo-true value
function becomes a complicated function of the parameters of the distribution
of $y_t$ (including $\alpha$). In fact, strictly speaking, the CPE tests as developed
by Mizon and Richard (1986) cannot be computed for all cases considered here
because the null hypothesis in this paper only specifies $E(y_t|x_t)$, whereas
the derivative of the pseudo-true value function can depend on the joint as
well as the conditional distribution of $y_t$ given $x_t$. As this function is
needed to compute the limiting distribution of the CPE statistic, one must
specify more than $E(y_t|x_t)$ under $H_0$. Nevertheless, in some cases the
"natural" way of operationalizing a CPE test leads to a test asymptotically
equivalent to the corresponding CME test. If the alternative model is
linear, e.g. $\mu_t(x_t,\beta) = w_t\beta$, it is sensible to choose

    $b_T(\alpha_o) \equiv \left[\sum_{t=1}^{T} w_t'w_t\right]^{-1}\sum_{t=1}^{T} w_t'm_t(x_t,\alpha_o)$

(but note that $b_T(\alpha_o)$ is generally random). In this case it is
straightforward to show that (6.6) and (3.6) lead to asymptotically
equivalent tests.
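For the linear alternative, the pseudo-true value displayed above is simply the OLS coefficient vector from regressing the null model's fitted mean on the regressors, so the Wald contrast is easy to compute. The sketch below additionally assumes that the alternative's estimator is OLS of the dependent variable on the same regressors, which the text does not state explicitly; the function names are hypothetical.

```python
import numpy as np

def pseudo_true_linear(w, m_vals):
    """b_T(alpha) = (sum w_t' w_t)^{-1} sum w_t' m_t(x_t, alpha)."""
    return np.linalg.solve(w.T @ w, w.T @ m_vals)

def wald_contrast(y, w, m_vals):
    """sqrt(T) * (beta_hat - b_T(alpha_hat)), with beta_hat assumed to be OLS of y on w."""
    beta_hat = np.linalg.solve(w.T @ w, w.T @ y)
    return np.sqrt(len(y)) * (beta_hat - pseudo_true_linear(w, m_vals))
```

Under the OLS assumption the contrast equals the scaled OLS coefficients from regressing the null residuals on the regressors, which gives some intuition for why the two tests can coincide asymptotically in this case.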

When the CPE test and the CME test are not asymptotically equivalent it
is difficult to determine analytically which has superior power properties.
It is unlikely that one test uniformly dominates the other in terms of
asymptotic local power. Comparing the powers of the tests in situations
where they are not asymptotically equivalent requires a detailed study that
is beyond the scope of the current paper. But it is useful to note that when
the two tests are not asymptotically equivalent the CME tests have
significant computational advantages, as well as being easy to "robustify."

As exposited by GMT (1983) and Mizon and Richard (1986), the CPE
principle has broad applicability. However, the current application of the
encompassing principle also applies to situations more general than WNLS.
The approach used in sections 3 and 5 can be invoked in any setting where
estimators are defined through optimization problems, including the general
maximum likelihood setting. But in more general settings the resulting tests
suffer from one of the same drawbacks as the CPE tests: calculation of the
test statistic requires estimation of the asymptotic variance of $\sqrt{T}(\hat\beta_T - \beta_T^o)$
under $H_0$. Nevertheless, the approach of this paper never requires one to
find or even to characterize in any way the pseudo-true value function. The
extension to general (quasi) maximum likelihood estimation is left to future
work, primarily because the statistics would no longer be very easy to
compute.

7.  Conclusions

The conditional mean tests developed in this paper are applicable to
testing nested and nonnested hypotheses for cross section or dynamic
conditional means. There are several attractive features of these tests.
First, they can be computed by using linear least squares regressions after
the original estimation. Second, they do not require homoskedasticity or
other second moment assumptions, and can be computed using any $\sqrt{T}$-consistent
estimators. Finally, the CME tests are asymptotically equivalent to well
known tests in special cases, such as the LM test for nested models and the
Durbin-Wu-Hausman test for comparing two WNLS estimators of the same
parameters.

Further  work  needs  to  be  done  to  investigate  the  finite  sample 
properties  of  the  statistics  proposed  here.   Ericsson  (1983)  has  compared  the 
powers  of  the  regression  F-test  mentioned  in  Example  4.1  to  the  DM  test,  and 
the  F-test  compares  favorably  for  many  alternatives.   One  might  expect  the 
CME  tests  to  perform  well  in  more  general  nonlinear,  dynamic  models,  but  this 
remains  to  be  seen.   In  addition,  it  would  be  useful  to  further  investigate 
the  relationship  between  the  complete  parametric  encompassing  tests  of  Mizon 
and  Richard  (1986)  and  the  CME  tests. 

The  nonnested  tests  extend  easily  to  the  case  of  more  than  one 
alternative  regression  function.   One  merely  includes  the  gradients  (or 
weighted  gradients)  of  all  competing  models  as  indicators.   The  same 
regression  procedures  are  still  appropriate.   Also,  CME  tests  can  be  derived 
in  a  straightforward  manner  for  multivariate  models  that  are  estimated  by 
multivariate  WNLS. 




References 

Aneuryn-Evans, G., and A. Deaton, 1980, Testing Linear Versus Logarithmic
Regression Models, Review of Economic Studies 47, 275-291.

Cox, D.R., 1961, Tests of Separate Families of Hypotheses, in Proceedings of
the Fourth Berkeley Symposium on Mathematical Statistics and
Probability, Vol. 1, 105-123.

Cox, D.R., 1962, Further Results on Tests of Separate Families of
Hypotheses, Journal of the Royal Statistical Society Series B 24,
406-424.

Davidson, R., and J.G. MacKinnon, 1981, Several Tests of Model Specification
in the Presence of Alternative Hypotheses, Econometrica 49, 781-793.

Davidson, R., and J.G. MacKinnon, 1985, Heteroskedasticity-Robust Tests in
Regression Directions, Annales de l'INSEE 59/60, 183-218.

Davidson, R., and J.G. MacKinnon, 1987, Implicit Alternatives and the Local
Power of Test Statistics, Econometrica 55, 1305-1329.

Durbin, J., 1954, Errors in Variables, Review of the International
Statistical Institute 22, 23-32.

Ericsson, N.R., 1983, Asymptotic Properties of Statistics for Testing
Nonnested Hypotheses, Review of Economic Studies 50, 287-304.

Gourieroux, C., A. Monfort, and A. Trognon, 1983, Testing Nested or
Nonnested Hypotheses, Journal of Econometrics 21, 83-115.

Gourieroux, C., A. Monfort, and A. Trognon, 1984, Pseudo Maximum Likelihood
Methods: Theory, Econometrica 52, 681-700.

Hausman, J.A., 1978, Specification Tests in Econometrics, Econometrica 46,
1251-1271.

Hendry, D.F., and J.-F. Richard, 1982, On the Formulation of Empirical
Models in Dynamic Econometrics, Journal of Econometrics 20, 3-33.

Mizon, G.E., 1984, The Encompassing Approach in Econometrics, in
Econometrics and Quantitative Modelling, ed. D.F. Hendry and K.F. Wallis.
Oxford: Basil Blackwell.

Mizon, G., and J.-F. Richard, 1986, The Encompassing Principle and Its
Application to Testing Nonnested Hypotheses, Econometrica 54, 657-678.

Pesaran, M.H., 1982, Comparison of Local Power of Alternative Tests of
Non-nested Regression Models, Econometrica 50, 1287-1305.

Pesaran, M.H., 1987, Global and Partial Non-nested Hypotheses and Asymptotic
Local Power, Econometric Theory 3, 69-97.

Rossi, P.E., 1985, Comparison of Alternative Functional Forms in Production,
Journal of Econometrics 30, 345-361.

Ruud, P.A., 1984, Tests of Specification in Econometrics, Econometric
Reviews 3, 211-242.

Vuong, Q.H., 1989, Likelihood Ratio Tests for Model Selection and Non-nested
Hypotheses, Econometrica, forthcoming.

White, H., 1980, Nonlinear Regression on Cross Section Data, Econometrica
48, 721-746.

White, H., 1987, Specification Testing in Dynamic Models, in Advances in
Econometrics, ed. T.F. Bewley. Cambridge: Cambridge University Press,
1-58.

White, H., and I. Domowitz, 1984, Nonlinear Regression with Dependent
Observations, Econometrica 52, 143-162.

Wooldridge, J.M., 1987a, A Regression-based Lagrange Multiplier Statistic
that is Robust in the Presence of Heteroskedasticity, MIT Department of
Economics Working Paper 478.

Wooldridge, J.M., 1987b, Specification Testing and Quasi-Maximum Likelihood
Estimation, MIT Department of Economics Working Paper 479.

Wooldridge, J.M., 1988, A Unified Approach to Robust, Regression-Based
Specification Tests, MIT Department of Economics Working Paper 480.

Wu, D., 1973, Alternative Tests of Independence Between Stochastic Regressors
and Disturbances, Econometrica 41, 733-750.
